Text File | 1996-10-20 | 14KB | 316 lines
# *************************************************************
# Templeton, copyright 1995, 1996 N.A. Krawetz
# All rights reserved.
# *************************************************************
# configuration for Templeton
#
# Lines beginning with a '#' are comments and are ignored.
# Lines should not be more than 80 characters.
# Operands in this file are in the form:
# parameter value
# The parameter is case insensitive, except where a text string or URL
# is required.
# Boolean values ("true" or "false") are case insensitive.
# Numeric values should be numbers -- non-numbers are regarded as 0.
# All other types of values ARE case sensitive.
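# For illustration (a sketch based on the rules above, not part of the
# original distribution), these two lines should be read identically,
# because parameter names and boolean values are case insensitive:
# RestrictHost TRUE
# restricthost true
# A URL or text-string value, by contrast, must keep its exact case.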
# ******************** Registration ****************************
# Register: registration code
# Software that is registered contains a unique registration
# code. This code should be entered exactly as it is provided.
# If your site contains multiple registrations, you may list
# each registration code on a line starting with the
# key word "Register".
# Please read the licensing agreement for registration
# information.
# Register 12-34567-891011
# ******************* File System *****************************
# LocalPath: absolute path
# LocalPath informs the program where to store the downloaded files.
# IF this path is:
# LocalPath none
# THEN no files are generated. Only a log file containing the remote
# server's WWW map is created in the current directory.
#
# Currently, files should be stored in the root directory of the file system.
# For WWW servers, this is the server's root directory.
# (This limitation will be removed in future releases.)
# For DOS based machines, this path may include a drive letter:
# LocalPath e:\server.www\
#
# Either slash "/" or backslash "\" is valid for specifying a directory.
# The trailing slash or backslash is optional.
#
# This option is only used when the "Interactive" option is FALSE.
LocalPath /
# User: e-mail address
# In case of emergency, this is the person who is running the program
# and who should be contacted to stop the program from running.
# This MUST be a valid e-mail address, and SHOULD also be available with
# a "talk" command.
# As a side note, it is never a good idea to let automatic software run
# unsupervised (especially this type of software). The "User" should be
# available to read their e-mail at all times during the execution of this
# program.
# The default is the account running the program on the current machine.
# User webmaster@host.machine.org
# ********************* Restrictions *****************************
# RestrictHost: boolean
# This parameter informs the program not to leave the designated host. Links
# to machines not on the current host are not traversed.
RestrictHost TRUE
# RestrictPath: absolute path
# This parameter is only used when a host is restricted.
# When a host is restricted, a subpath on that host may also be restricted.
# Hypertext references to documents outside this subtree are not traversed.
# Either slash "/" or backslash "\" is valid for specifying a directory.
# The trailing slash or backslash is optional.
RestrictPath /
# RestrictDepth: numeric value
# Hyperlinks are traversed in a breadth-first search. An unrestricted search
# may download an entire WWW server's data. By restricting the depth,
# only immediate portions of the server will be received.
# Images and non-href links are considered to be at the same depth as the
# document.
# A restricted depth of 0 means no restriction.
# The default is 1
RestrictDepth 1
# RemoveRestricted: boolean
# This parameter informs the program to remove untraversed links. Links to
# restricted machines or restricted depths are removed from the HTML file,
# but the visible text is still available (just not a hyperlink).
# The default value is FALSE.
RemoveRestricted FALSE
# Add: URL
# Place a specific URL on the list of URLs to process.
# Be aware that restrictions apply.
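# For example (a hypothetical URL, shown only to illustrate the syntax):
# Add http://host.machine.org/docs/index.html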
# Exclusion: boolean
# This parameter determines whether Templeton will support server provided
# robot exclusion files (robots.txt). Many servers maintain exclusion files
# to prevent robots from wandering around virtual directory trees, from
# retrieving very temporary or incomplete files, or copyrighted materials. It
# is considered "polite" for web agents to obey the exclusion files when they
# exist. The default value, TRUE, means that robot exclusion files are obeyed.
# Setting Exclusion to FALSE will ignore robot exclusion files.
Exclusion TRUE
# Deny: URL
# The URL provided, as well as all subtrees of the URL, are not processed.
# Many times specific directory subtrees are not desirable. You can deny
# retrieval of these URLs using this setting.
# For example, to NOT retrieve the "archive" subtree of the host loco.com,
# you would specify:
# Deny http://loco.com/archive/
# If you do not include the trailing slash (http://loco.com/archive) then
# all subdirectories beginning with "archive" are not processed. This
# includes "archive.1", "archive.old", "archive_from_1994", etc.
# Multiple Deny statements may be specified.
# Allow: URL
# Similar to "Deny", "Allow" explicitly specifies that a subtree is
# retrievable. When used in conjunction with Deny URL, branches of a
# subtree may be specified for access, while other subtrees are ignored.
# Multiple Allow statements may be specified.
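# For example, assuming a hypothetical "archive/current" subtree on the
# host loco.com, the following pair would skip the archive as a whole but
# still retrieve its "current" branch:
# Deny http://loco.com/archive/
# Allow http://loco.com/archive/current/
# (How Templeton resolves overlapping Allow and Deny entries is not stated
# here; this sketch assumes the more specific entry applies.)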
# Sleep: numeric
# Sleep determines the number of seconds to pause before sending a request to
# a WWW server. SLEEP IS IMPORTANT.
# Warning: Templeton can generate thousands of requests per minute. Many
# WWW servers cannot handle a sudden onslaught of requests. Setting the
# Sleep parameter to 0 (zero) may generate too many requests for the server
# and kill the server. This is bad.
# A sleep setting of 0 (zero) is known to kill the following types of servers:
# All WWW servers that run under Microsoft Windows (TM)
# Old generation (HTTP/1.0) CERN servers on all platforms
# Low sleep values may also generate large amounts of network traffic and
# hog network resources.
# For safety, you should set the sleep interval to at least 5 seconds.
# The longer, the better. Remember, this program is automated and can
# easily run for hours. What's the rush?
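# As a rough worked example (assuming one request per file and ignoring
# transfer time): with Sleep 10, at most 6 requests are sent per minute,
# so 1000 files take at least 1000 x 10 = 10000 seconds, or about
# 2.8 hours.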
Sleep 10
# ********************* Network *****************************
# ProxyHost: hostname or IP address
# Proxy agents are machines that act as a gateway through a firewall.
# If your local network uses a proxy agent, specify the name of
# the proxy agent here. If you are uncertain about your network, consult your
# network manager or provider.
# A proxy server is only used when a proxy host is specified.
# ProxyHost proxyhost.network.net
# ProxyPort: integer
# When using a proxy server (see ProxyHost), the port on the proxy server
# should be specified. The default port is 80. This value is not
# used if no proxy host is specified with ProxyHost.
ProxyPort 80
# Spoof: text-string
# Some WWW servers make incorrect assumptions about the browser/robots. (Most
# of these are the Netscape servers.) These servers assume that, since the
# browser is not "Netscape" the browser cannot handle the HTML documents and
# therefore, the document is not transferred. By "spoofing" a different name,
# the WWW robot can use a qualified browser name to retrieve the HTML
# document.
# NOTE: The first word of the spoof-name is used for restrictions when
# robot exclusion is honored (see Exclusion). This means, if Templeton tells
# the WWW server that it is "Netscape" and the server does not permit
# Netscape browsers, then the server will also not permit Templeton.
# Common spoof names (and browsers) are:
#     Mozilla       Netscape Browser
#     WebCrawler    WebCrawler robot
#     InfoSeek      InfoSeek robot
#     WebExplorer   IBM WebExplorer for OS/2
#     Harvest       a web robot
#     Mosaic        NCSA Mosaic
#     Lynx          Lynx, text browser
#     Microsoft     Internet Explorer
#     PRODIGY-WB    Prodigy browser
# Spoof Mozilla (Templeton)
# ********************* Preferences *****************************
# FATFormat: boolean
# Determines the file name format for the current operating system.
# DOS based machines using drives formatted with a File Allocation Table (FAT)
# can only handle file names containing 8 characters and a 3 character
# extension. Setting this option to TRUE will generate 8.3 character file
# names. The default is FALSE, and will generate unlimited length file names.
# NOTE: Under DOS, this option is always TRUE (DOS only supports FAT file
# names). Under OS/2, this value becomes TRUE automatically if the destination
# path (LocalPath) is located on a FAT partition.
FATFormat FALSE
# FileOverwrite: boolean
# Files that already exist on the local system are not normally downloaded.
# Setting the FileOverwrite option to TRUE will overwrite files on the
# local file system. Default value is FALSE.
FileOverwrite TRUE
# Index: file name
# For hypertext references that only specify a directory, this is the
# default html file in the directory.
# NOTE: if FATFormat is TRUE, the 8.3 name translation will be applied to
# this file name.
# The default name is "index.html"
Index index.html
# ISMAP: absolute path to executable
# For WWW servers, many imagemaps use a program that takes coordinates from
# a selected image <IMG SRC=... ISMAP> and returns a new URL. Some of the
# more common methods use a data file containing known coordinates and a
# program to identify which URL is activated. Commonly, this program is
# called "imagemap" or "imagemap.exe".
# The ISMAP parameter specifies the WWW server's path to the imagemap program.
ISMAP /cgi-bin/imagemap
# MapType: NCSA or CERN
# For the executable specified in the ISMAP parameter (see above), this
# option determines the format of the file. If the image map file can be
# retrieved, then it is converted into this specified format.
# Valid options are either "CERN" or "NCSA". The default is NCSA.
MapType NCSA
# ********************* Logging *****************************
# Mailto-File: file name
# Similar to "Server-File" logging, the file name listed on the "Mailto-File"
# line contains a list of e-mail addresses found in the HTML documents. Only
# e-mail addresses that are active (hyperlinks) are used. E-mail addresses
# displayed as plain text in the document or contained in CGI scripts are not
# listed in the mailto logfile.
# NOTE: This list MAY contain duplicate entries. Duplication removal may be
# added in later versions.
# (Some people have found this to be a very useful feature for generating
# mailing lists.)
# The default is no mailto logging.
# Mailto-File mailtolist
# RemoteMapping: boolean
# Determines whether remote mapping will be done. The default is TRUE,
# which does perform mapping. The map file name is mapindex.html and is
# either located at the root of the LocalPath or in the current directory
# if the system is not mirroring files.
# Note: if you change the default index name, for example, to "welcome.html"
# then the default map file will be "mapwelcome.html".
RemoteMapping TRUE
# Server-File: file name
# A data file is generated containing the host name, IP address, and
# WWW server type for each server visited. For servers listed as IP
# address only, the host name is also the IP address.
# The default is no server logging.
# Server-File serverlist
# ********************* Advanced *****************************
# The advanced configuration commands should be used with caution.
# These commands allow other applications to perform tasks on the
# retrieved documents. Applications that are spawned (operate
# concurrently) with Templeton may overwhelm the user or operating system.
# Spawned applications include those begun with "start" under OS/2,
# or followed by "&" under Unix.
# NOTE: Templeton has the capability to spawn thousands of applications
# in a few seconds.
# On Unix-type systems, Templeton introduces security risks when executed
# as root.
# For applications that are not spawned, Templeton will pause until
# the application has ended. This allows for a guaranteed order of processing
# for the called applications.
# Command_html: string
# Execute a system command on each HTML document stored on the file system.
# This may be useful for counting documents, storing statistics, printing,
# converting, etc.
# The string should contain the executable to run and a %s for the file name.
# The string "none" turns off this command. This is the default.
# For example: to convert all HTML documents to text using the program
# html2txt (not provided with the Templeton distribution), you would use:
# Command_html html2txt %s
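# As a sketch of the spawned form described at the top of this section
# (an assumption; check your shell before relying on it), the same
# conversion could be started without Templeton waiting for it to finish.
# Under OS/2:
# Command_html start html2txt %s
# Under Unix:
# Command_html html2txt %s &
# Spawning gives up the guaranteed order of processing and may start many
# copies of html2txt at once.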
# Command_image: string
# Execute a system command on each image-file stored on the file system.
# Similar to Command_html, Command_image is executed on all image files.
# This may be useful for counting documents, storing statistics, printing,
# converting, etc. NOTE: no distinction is made between different image
# formats.
# The string should contain the executable to run and a %s for the file name.
# The string "none" turns off this command. This is the default.
# Command_map: string
# Execute a system command on each image-map stored on the file system.
# Similar to Command_html, Command_map is executed on all image-map files.
# This may be useful for counting documents, storing statistics, or converting.
# The string should contain the executable to run and a %s for the file name.
# The string "none" turns off this command. This is the default.
# Command_default: string
# Execute a system command on each file stored on the file system.
# Similar to Command_html, Command_default is executed on all files that have
# no other executable specified. This may be useful for counting documents,
# storing statistics, printing, converting, etc.
# The string should contain the executable to run and a %s for the file name.
# The string "none" turns off this command. This is the default.
# Interactive: boolean
# Determines whether the user should be prompted for
# configuration information or if Templeton should
# start running automatically.
# The default setting is TRUE.
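# For an unattended run, a sketch (the path is the example given under
# LocalPath above, and LocalPath is only used when Interactive is FALSE)
# might be:
# Interactive FALSE
# LocalPath e:\server.www\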